FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding
Large-scale cross-lingual language models (LM), such as mBERT, Unicoder and
XLM, have achieved great success in cross-lingual representation learning.
However, when applied to zero-shot cross-lingual transfer tasks, most existing
methods use only single-language input for LM finetuning, without leveraging
the intrinsic cross-lingual alignment between different languages that proves
essential for multilingual tasks. In this paper, we propose FILTER, an enhanced
fusion method that takes cross-lingual data as input for XLM finetuning.
Specifically, FILTER first encodes text input in the source language and its
translation in the target language independently in the shallow layers, then
performs cross-language fusion to extract multilingual knowledge in the
intermediate layers, and finally performs further language-specific encoding.
During inference, the model makes predictions based on the text input in the
target language and its translation in the source language. For simple tasks
such as classification, translated text in the target language shares the same
label as the source language. However, this shared label becomes less accurate
or even unavailable for more complex tasks such as question answering, NER and
POS tagging. To tackle this issue, we further propose an additional
KL-divergence self-teaching loss for model training, based on auto-generated
soft pseudo-labels for translated text in the target language. Extensive
experiments demonstrate that FILTER achieves new state of the art on two
challenging multilingual multi-task benchmarks, XTREME and XGLUE.Comment: Accepted to AAAI 2021; Top-1 Performance on XTREME
(https://sites.research.google/xtreme, September 8, 2020) and XGLUE
(https://microsoft.github.io/XGLUE, September 14, 2020) benchmar
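The self-teaching idea above can be sketched as a KL-divergence loss between the model's source-language predictions (used as soft pseudo-labels) and its predictions on the translated target-language input. This is a minimal illustrative sketch, not FILTER's actual implementation; the function names and temperature parameter are assumptions.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Numerically stable softmax over the last axis.
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def self_teaching_kl_loss(source_logits, target_logits):
    """KL(p_source || p_target): source-language predictions act as
    soft pseudo-labels for the translated target-language input."""
    p = softmax(source_logits)   # teacher: soft pseudo-labels
    q = softmax(target_logits)   # student: target-language predictions
    return float(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))

# Identical predictions give zero loss; diverging predictions increase it.
loss_same = self_teaching_kl_loss(np.array([2.0, 0.5, -1.0]),
                                  np.array([2.0, 0.5, -1.0]))
loss_diff = self_teaching_kl_loss(np.array([2.0, 0.5, -1.0]),
                                  np.array([-1.0, 0.5, 2.0]))
```

In practice such a loss is combined with the supervised task loss; the soft labels let tasks like NER or QA, where hard labels do not survive translation, still supervise the target-language side.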
Fuzzy Sparse Autoencoder Framework for Single Image Per Person Face Recognition
The issue of single sample per person (SSPP) face recognition has attracted increasing attention in recent years. Patch/local-based algorithms form one of the most popular categories for addressing the issue, as patch/local features are robust to face image variations. However, patch/local-based algorithms ignore global discriminative information, which is crucial for recognizing the non-discriminative regions of face images. To exploit the advantages of both local and global information, a novel two-layer local-to-global feature learning framework is proposed for SSPP face recognition. In the first layer, objective-oriented local features are learned by a patch-based fuzzy rough set feature selection strategy. The obtained local features are not only robust to image variations but also preserve the discriminative ability of the original patches. In the second layer, global structural information is extracted from the local features by a sparse autoencoder, which reduces the negative effect of non-discriminative regions. Moreover, the proposed framework is a shallow network, which avoids the over-fitting caused by using a multilayer network to address the SSPP problem. Experimental results show that the proposed local-to-global feature learning framework achieves superior performance over other state-of-the-art feature learning algorithms for SSPP face recognition.
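The sparse autoencoder in the second layer relies on a sparsity constraint on its hidden units. A common way to express this constraint is a KL-divergence penalty that pushes each unit's mean activation toward a small target value; the sketch below shows that penalty only, as an illustration, and is not the paper's exact formulation.

```python
import numpy as np

def sparsity_penalty(activations, rho=0.05):
    """KL-divergence sparsity penalty used in sparse autoencoders:
    pushes the mean activation rho_hat of each hidden unit toward a
    small target rho, so only a few units fire for any given input.

    activations: (n_samples, n_hidden) array of sigmoid activations in (0, 1).
    """
    rho_hat = np.clip(activations.mean(axis=0), 1e-6, 1 - 1e-6)
    return float(np.sum(rho * np.log(rho / rho_hat)
                        + (1 - rho) * np.log((1 - rho) / (1 - rho_hat))))
```

The penalty is zero when the mean activations already equal the target sparsity and grows as the hidden layer becomes denser, which is what suppresses non-discriminative regions in the learned global representation.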
Fuzzy superpixels for polarimetric SAR images classification
The superpixel technique has drawn much attention in computer vision applications. Each superpixel algorithm has its own advantages, and selecting a more appropriate superpixel algorithm for a specific application can improve that application's performance. In the last few years, superpixels have been widely used in polarimetric synthetic aperture radar (PolSAR) image classification. However, no superpixel algorithm has been designed specifically for image classification. It is believed that both mixed superpixels and pure superpixels exist in an image; mixed superpixels, however, have negative effects on classification accuracy. Thus, it is necessary to generate superpixels containing as few mixed superpixels as possible for image classification. In this paper, first, a novel superpixel concept, named fuzzy superpixels, is proposed to reduce the generation of mixed superpixels. In fuzzy superpixels, not all pixels are assigned to a corresponding superpixel; we would rather ignore pixels than assign them to improper superpixels. Second, a new algorithm, named FuzzyS (FS), is proposed to generate fuzzy superpixels for PolSAR image classification. Three PolSAR images are used to verify the effect of the proposed FS algorithm. Experimental results demonstrate the superiority of the proposed FS algorithm over several state-of-the-art superpixel algorithms.
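The core idea of fuzzy superpixels, leaving ambiguous pixels unassigned rather than forcing them into a superpixel, can be sketched with a simple confidence threshold on fuzzy memberships. This is an illustrative sketch only; the threshold rule and names are assumptions, not the FS algorithm itself.

```python
import numpy as np

def fuzzy_assign(memberships, threshold=0.6):
    """Assign each pixel to its best superpixel only when its fuzzy
    membership is confident enough; otherwise mark it undetermined (-1).

    memberships: (n_pixels, n_superpixels) array of membership degrees
    summing to 1 per pixel (e.g. from fuzzy c-means clustering).
    """
    best = memberships.argmax(axis=1)
    confident = memberships.max(axis=1) >= threshold
    return np.where(confident, best, -1)  # -1 = undetermined pixel

m = np.array([[0.90, 0.05, 0.05],   # confidently superpixel 0
              [0.40, 0.35, 0.25]])  # ambiguous -> undetermined
labels = fuzzy_assign(m)
```

Pixels left undetermined are exactly the ones most likely to produce mixed superpixels, so excluding them keeps the remaining superpixels purer for the downstream classifier.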
Research on the Mechanism of Entrepreneurship Education on College Students' Entrepreneurial Willingness and Its Future Prediction
The strength of college students' entrepreneurial willingness is a barometer for measuring the effectiveness of entrepreneurship education. It is also an important avenue for college students to expand their employment opportunities and enhance the quality of their employment in the face of new employment trends. Comprehensive universities offer a wide range of disciplines and a high degree of professional specialization, so it is of great significance to explore the diversity of college students' entrepreneurship education indicators. According to data on the relationship between entrepreneurship education and entrepreneurial willingness in comprehensive universities in Jiangsu province, factors such as subject characteristics, work experience, educational background, and family environment significantly affect college students' willingness to become entrepreneurs. The implementation of entrepreneurship education, including the awakening of entrepreneurial consciousness, the cultivation of entrepreneurial abilities, and the improvement of entrepreneurial willingness, has a direct impact on college students' willingness to start their own businesses.
Brain-inspired Graph Spiking Neural Networks for Commonsense Knowledge Representation and Reasoning
How neural networks in the human brain represent commonsense knowledge, and
complete related reasoning tasks is an important research topic in
neuroscience, cognitive science, psychology, and artificial intelligence.
Although the traditional artificial neural network using fixed-length vectors
to represent symbols has gained good performance in some specific tasks, it is
still a black box that lacks interpretability, far from how humans perceive the
world. Inspired by the grandmother-cell hypothesis in neuroscience, this work
investigates how population encoding and spike-timing-dependent plasticity
(STDP) mechanisms can be integrated into the learning of spiking neural
networks, and how a population of neurons can represent a symbol by guiding
the completion of sequential firing between different neuron populations. The
neuron populations of different communities together constitute the entire
commonsense knowledge graph, forming a giant graph spiking neural network.
Moreover, we introduce the reward-modulated spike-timing-dependent
plasticity (R-STDP) mechanism to simulate the biological reinforcement
learning process and complete the related reasoning tasks accordingly,
achieving accuracy comparable to, and faster convergence than, graph
convolutional artificial neural networks. For neuroscience and cognitive
science, this work provides a foundation of computational modeling for
further exploring how the human brain represents commonsense knowledge. For
artificial intelligence, it indicates a direction for realizing more robust
and interpretable neural networks by constructing commonsense knowledge
representation and reasoning spiking neural networks with solid biological
plausibility.
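The STDP rule that the abstract builds on can be sketched as a function of the relative spike timing between a pre- and a postsynaptic neuron: pre-before-post potentiates the synapse, post-before-pre depresses it. This is the textbook exponential STDP window, shown as an illustration; the paper's population-level and reward-modulated variants build on it but are not reproduced here.

```python
import math

def stdp_dw(delta_t, a_plus=0.1, a_minus=0.12, tau=20.0):
    """Spike-timing-dependent plasticity: weight change as a function of
    delta_t = t_post - t_pre (in ms). Pre-before-post (delta_t > 0)
    potentiates; post-before-pre (delta_t < 0) depresses. The magnitude
    decays exponentially with the timing gap (time constant tau)."""
    if delta_t > 0:
        return a_plus * math.exp(-delta_t / tau)
    elif delta_t < 0:
        return -a_minus * math.exp(delta_t / tau)
    return 0.0
```

In the reward-modulated variant (R-STDP), this raw weight change is typically gated by a scalar reward signal, which is what lets the network learn reasoning tasks through reinforcement rather than pure correlation.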
Fuzzy Superpixels based Semi-supervised Similarity-constrained CNN for PolSAR Image Classification
Recently, deep learning has been highly successful in image classification. Labeling PolSAR data, however, is time-consuming and laborious, and in response semi-supervised deep learning has been increasingly investigated for PolSAR image classification. Semi-supervised deep learning methods for PolSAR image classification can be broadly divided into two categories: pixel-based methods and superpixel-based methods. Pixel-based semi-supervised methods are liable to be affected by speckle noise and have relatively high computational complexity. Superpixel-based methods focus on superpixels and ignore the fine details represented by individual pixels. In this paper, a Fuzzy superpixels based Semi-supervised Similarity-constrained CNN (FS-SCNN) is proposed. To reduce the effect of speckle noise and preserve details, FS-SCNN uses a fuzzy superpixel algorithm to segment an image into two parts: superpixels and undetermined pixels. Moreover, the fuzzy superpixel algorithm can also reduce the number of mixed superpixels and improve classification performance. To exploit unlabeled data effectively, we also propose a Similarity-constrained Convolutional Neural Network (SCNN) model to assign pseudo-labels to unlabeled data. The final training set consists of the initial labeled data and these pseudo-labeled data. Three PolSAR images are used to demonstrate the excellent classification performance of the FS-SCNN method with limited labeled data.
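The pseudo-labeling step described above can be sketched as similarity-constrained label propagation: an unlabeled sample inherits the label of its most similar labeled sample, but only when that similarity clears a threshold. This is an illustrative sketch using cosine similarity on feature vectors; the actual SCNN learns the similarity with a CNN, and the names and threshold here are assumptions.

```python
import numpy as np

def pseudo_label(labeled_feats, labels, unlabeled_feats, sim_threshold=0.95):
    """Assign each unlabeled sample the label of its most similar labeled
    sample (cosine similarity), but only when the similarity passes the
    threshold; otherwise leave it unlabeled (-1)."""
    a = labeled_feats / np.linalg.norm(labeled_feats, axis=1, keepdims=True)
    b = unlabeled_feats / np.linalg.norm(unlabeled_feats, axis=1, keepdims=True)
    sims = b @ a.T                       # (n_unlabeled, n_labeled)
    nearest = sims.argmax(axis=1)        # most similar labeled sample
    confident = sims.max(axis=1) >= sim_threshold
    return np.where(confident, labels[nearest], -1)

labeled = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([0, 1])
unlabeled = np.array([[0.99, 0.01],      # close to class 0
                      [0.50, 0.50]])     # equidistant -> stays unlabeled
pl = pseudo_label(labeled, y, unlabeled)
```

Only the confidently pseudo-labeled samples join the training set, which keeps label noise from ambiguous samples out of the final classifier.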
Evaluating Very Long-Term Conversational Memory of LLM Agents
Existing works on long-term open-domain dialogues focus on evaluating model
responses within contexts spanning no more than five chat sessions. Despite
advancements in long-context large language models (LLMs) and retrieval
augmented generation (RAG) techniques, their efficacy in very long-term
dialogues remains unexplored. To address this research gap, we introduce a
machine-human pipeline to generate high-quality, very long-term dialogues by
leveraging LLM-based agent architectures and grounding their dialogues on
personas and temporal event graphs. Moreover, we equip each agent with the
capability of sharing and reacting to images. The generated conversations are
verified and edited by human annotators for long-range consistency and
grounding to the event graphs. Using this pipeline, we collect LoCoMo, a
dataset of very long-term conversations, each encompassing 300 turns and 9K
tokens on avg., over up to 35 sessions. Based on LoCoMo, we present a
comprehensive evaluation benchmark to measure long-term memory in models,
encompassing question answering, event summarization, and multi-modal dialogue
generation tasks. Our experimental results indicate that LLMs exhibit
challenges in understanding lengthy conversations and comprehending long-range
temporal and causal dynamics within dialogues. Employing strategies like
long-context LLMs or RAG can offer improvements but these models still
substantially lag behind human performance.
Comment: 19 pages; Project page: https://snap-research.github.io/locomo